Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 1292 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 101.1 KiB |
| Average record size in memory | 80.1 B |
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 2 |
Warnings
| Price is highly correlated with Age0804 and 2 other fields | High correlation |
|---|---|
| Age0804 is highly correlated with Price and 1 other field | High correlation |
| KM is highly correlated with Price and 1 other field | High correlation |
| QuarterlyTax is highly correlated with Weight | High correlation |
| Weight is highly correlated with Price and 1 other field | High correlation |
| Price is highly correlated with Age0804 and 1 other field | High correlation |
| Age0804 is highly correlated with Price and 1 other field | High correlation |
| KM is highly correlated with Price and 1 other field | High correlation |
| cc is highly correlated with QuarterlyTax and 1 other field | High correlation |
| QuarterlyTax is highly correlated with cc and 1 other field | High correlation |
| Weight is highly correlated with cc and 1 other field | High correlation |
| Price is highly correlated with Age0804 | High correlation |
| Age0804 is highly correlated with Price | High correlation |
| cc is highly correlated with Weight | High correlation |
| QuarterlyTax is highly correlated with Weight | High correlation |
| Weight is highly correlated with cc and 1 other field | High correlation |
| KM is highly correlated with HP and 2 other fields | High correlation |
| HP is highly correlated with KM and 4 other fields | High correlation |
| Doors is highly correlated with Weight and 1 other field | High correlation |
| Weight is highly correlated with HP and 4 other fields | High correlation |
| Age0804 is highly correlated with KM and 4 other fields | High correlation |
| QuarterlyTax is highly correlated with HP and 4 other fields | High correlation |
| Price is highly correlated with KM and 4 other fields | High correlation |
| cc is highly skewed (γ1 = 26.69622404) | Skewed |
| Unnamed: 0 is uniformly distributed | Uniform |
| Unnamed: 0 has unique values | Unique |
Reproduction
| Analysis started | 2021-06-30 19:02:17.395470 |
|---|---|
| Analysis finished | 2021-06-30 19:02:32.258308 |
| Duration | 14.86 seconds |
| Software version | pandas-profiling v3.0.0 |
Unnamed: 0
| Distinct | 1292 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 645.5 |
| Minimum | 0 |
| Maximum | 1291 |
| Zeros | 1 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 64.55 |
| Q1 | 322.75 |
| median | 645.5 |
| Q3 | 968.25 |
| 95-th percentile | 1226.45 |
| Maximum | 1291 |
| Range | 1291 |
| Interquartile range (IQR) | 645.5 |
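The quantile rows above are consistent with linear interpolation between order statistics, where the q-th quantile sits at fractional position q·(n − 1) in the sorted data. That the report uses this rule is inferred from the numbers, not stated in it. A minimal standard-library sketch for the column whose values run from 0 to 1291 (`quantile_linear` is a hypothetical helper name):

```python
import math


def quantile_linear(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    data = sorted(values)
    pos = q * (len(data) - 1)           # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)     # clamp so q = 1 stays in range
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


# Assumption: the column holds exactly the integers 0..1291 (1292 unique values).
index = list(range(1292))

p05 = quantile_linear(index, 0.05)
q1 = quantile_linear(index, 0.25)
q3 = quantile_linear(index, 0.75)
iqr = q3 - q1
print(p05, q1, q3, iqr)
```

Under that assumption the results agree with the table: Q1 and Q3 differ by the reported interquartile range of 645.5.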
Descriptive statistics
| Standard deviation | 373.1125835 |
|---|---|
| Coefficient of variation (CV) | 0.5780210434 |
| Kurtosis | -1.2 |
| Mean | 645.5 |
| Median Absolute Deviation (MAD) | 323 |
| Skewness | 0 |
| Sum | 833986 |
| Variance | 139213 |
| Monotonicity | Strictly increasing |
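Because this column is just the row index, its descriptive statistics can be checked by hand. The sketch below assumes the column contains exactly the integers 0..1291 and that the variance and standard deviation use the sample (n − 1) denominator, which is what the reported figures suggest:

```python
import statistics

# Assumption: the column is the plain row index 0..1291.
index = list(range(1292))

total = sum(index)
mean = statistics.mean(index)
variance = statistics.variance(index)    # sample variance, n - 1 denominator
std = statistics.stdev(index)            # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(index)
# Median absolute deviation: median distance from the median.
mad = statistics.median([abs(x - median) for x in index])

print(total, mean, variance, round(std, 7), round(cv, 10), mad)
```

A strictly increasing column with 100% distinct values and these statistics usually carries no information beyond row order, so it is a common candidate for dropping before modelling.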
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1291 | 1 | 0.1% |
| 403 | 1 | 0.1% |
| 425 | 1 | 0.1% |
| 426 | 1 | 0.1% |
| 427 | 1 | 0.1% |
| 428 | 1 | 0.1% |
| 429 | 1 | 0.1% |
| 430 | 1 | 0.1% |
| 431 | 1 | 0.1% |
| 432 | 1 | 0.1% |
| Other values (1282) | 1282 | 99.2% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | 0.1% |
| 1 | 1 | 0.1% |
| 2 | 1 | 0.1% |
| 3 | 1 | 0.1% |
| 4 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6 | 1 | 0.1% |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1291 | 1 | 0.1% |
| 1290 | 1 | 0.1% |
| 1289 | 1 | 0.1% |
| 1288 | 1 | 0.1% |
| 1287 | 1 | 0.1% |
| 1286 | 1 | 0.1% |
| 1285 | 1 | 0.1% |
| 1284 | 1 | 0.1% |
| 1283 | 1 | 0.1% |
| 1282 | 1 | 0.1% |
Price
| Distinct | 223 |
|---|---|
| Distinct (%) | 17.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10732.46904 |
| Minimum | 4400 |
| Maximum | 31275 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 4400 |
|---|---|
| 5-th percentile | 6900 |
| Q1 | 8450 |
| median | 9900 |
| Q3 | 11926.25 |
| 95-th percentile | 18950 |
| Maximum | 31275 |
| Range | 26875 |
| Interquartile range (IQR) | 3476.25 |
Descriptive statistics
| Standard deviation | 3622.465521 |
|---|---|
| Coefficient of variation (CV) | 0.3375239665 |
| Kurtosis | 3.240482811 |
| Mean | 10732.46904 |
| Median Absolute Deviation (MAD) | 1650 |
| Skewness | 1.649458653 |
| Sum | 13866350 |
| Variance | 13122256.45 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 8950 | 101 | 7.8% |
| 9950 | 74 | 5.7% |
| 10950 | 58 | 4.5% |
| 7950 | 54 | 4.2% |
| 11950 | 41 | 3.2% |
| 8250 | 36 | 2.8% |
| 8750 | 35 | 2.7% |
| 10500 | 32 | 2.5% |
| 7750 | 32 | 2.5% |
| 12950 | 28 | 2.2% |
| Other values (213) | 801 | 62.0% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 4400 | 1 | 0.1% |
| 4450 | 1 | 0.1% |
| 4750 | 1 | 0.1% |
| 5150 | 1 | 0.1% |
| 5250 | 2 | 0.2% |
| 5600 | 1 | 0.1% |
| 5740 | 1 | 0.1% |
| 5750 | 2 | 0.2% |
| 5751 | 1 | 0.1% |
| 5800 | 1 | 0.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 31275 | 1 | 0.1% |
| 31000 | 1 | 0.1% |
| 24990 | 1 | 0.1% |
| 24950 | 2 | 0.2% |
| 24500 | 1 | 0.1% |
| 23950 | 1 | 0.1% |
| 23750 | 1 | 0.1% |
| 23000 | 1 | 0.1% |
| 22950 | 1 | 0.1% |
| 22750 | 1 | 0.1% |
Age0804
| Distinct | 77 |
|---|---|
| Distinct (%) | 6.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 55.97368421 |
| Minimum | 1 |
| Maximum | 80 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 44 |
| median | 61 |
| Q3 | 70 |
| 95-th percentile | 79 |
| Maximum | 80 |
| Range | 79 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 18.54930636 |
|---|---|
| Coefficient of variation (CV) | 0.3313933434 |
| Kurtosis | -0.06457888977 |
| Mean | 55.97368421 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.8310493009 |
| Sum | 72318 |
| Variance | 344.0767663 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 68 | 65 | 5.0% |
| 65 | 58 | 4.5% |
| 80 | 52 | 4.0% |
| 78 | 40 | 3.1% |
| 62 | 37 | 2.9% |
| 67 | 36 | 2.8% |
| 77 | 34 | 2.6% |
| 54 | 34 | 2.6% |
| 75 | 33 | 2.6% |
| 61 | 32 | 2.5% |
| Other values (67) | 871 | 67.4% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2 | 0.2% |
| 2 | 2 | 0.2% |
| 4 | 2 | 0.2% |
| 6 | 1 | 0.1% |
| 7 | 4 | 0.3% |
| 8 | 9 | 0.7% |
| 9 | 3 | 0.2% |
| 10 | 1 | 0.1% |
| 11 | 6 | 0.5% |
| 12 | 2 | 0.2% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 80 | 52 | 4.0% |
| 79 | 27 | 2.1% |
| 78 | 40 | 3.1% |
| 77 | 34 | 2.6% |
| 76 | 26 | 2.0% |
| 75 | 33 | 2.6% |
| 74 | 28 | 2.2% |
| 73 | 30 | 2.3% |
| 72 | 20 | 1.5% |
| 71 | 21 | 1.6% |
KM
| Distinct | 1156 |
|---|---|
| Distinct (%) | 89.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 68627.46827 |
| Minimum | 1 |
| Maximum | 243000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 17045.15 |
| Q1 | 42499.25 |
| median | 63831 |
| Q3 | 87676.25 |
| 95-th percentile | 138790.25 |
| Maximum | 243000 |
| Range | 242999 |
| Interquartile range (IQR) | 45177 |
Descriptive statistics
| Standard deviation | 37714.61526 |
|---|---|
| Coefficient of variation (CV) | 0.5495556839 |
| Kurtosis | 1.619734528 |
| Mean | 68627.46827 |
| Median Absolute Deviation (MAD) | 22525 |
| Skewness | 1.006002007 |
| Sum | 88666689 |
| Variance | 1422392204 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 43000 | 7 | 0.5% |
| 1 | 7 | 0.5% |
| 36000 | 7 | 0.5% |
| 59000 | 6 | 0.5% |
| 75000 | 6 | 0.5% |
| 61000 | 5 | 0.4% |
| 60000 | 5 | 0.4% |
| 68000 | 4 | 0.3% |
| 37000 | 4 | 0.3% |
| 52000 | 4 | 0.3% |
| Other values (1146) | 1237 | 95.7% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 7 | 0.5% |
| 15 | 1 | 0.1% |
| 225 | 1 | 0.1% |
| 450 | 1 | 0.1% |
| 1500 | 1 | 0.1% |
| 3000 | 1 | 0.1% |
| 4000 | 1 | 0.1% |
| 5000 | 2 | 0.2% |
| 5278 | 1 | 0.1% |
| 5309 | 1 | 0.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 243000 | 1 | 0.1% |
| 232940 | 1 | 0.1% |
| 218118 | 1 | 0.1% |
| 216000 | 1 | 0.1% |
| 207114 | 1 | 0.1% |
| 205000 | 1 | 0.1% |
| 204250 | 1 | 0.1% |
| 203254 | 1 | 0.1% |
| 200732 | 1 | 0.1% |
| 198167 | 1 | 0.1% |
HP
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 101.4287926 |
| Minimum | 69 |
| Maximum | 192 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 69 |
|---|---|
| 5-th percentile | 72 |
| Q1 | 86 |
| median | 110 |
| Q3 | 110 |
| 95-th percentile | 110 |
| Maximum | 192 |
| Range | 123 |
| Interquartile range (IQR) | 24 |
Descriptive statistics
| Standard deviation | 15.25747138 |
|---|---|
| Coefficient of variation (CV) | 0.1504254462 |
| Kurtosis | 9.106434146 |
| Mean | 101.4287926 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.086279095 |
| Sum | 131046 |
| Variance | 232.7904329 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 110 | 745 | 57.7% |
| 86 | 230 | 17.8% |
| 97 | 149 | 11.5% |
| 72 | 66 | 5.1% |
| 90 | 32 | 2.5% |
| 69 | 31 | 2.4% |
| 107 | 16 | 1.2% |
| 192 | 11 | 0.9% |
| 116 | 8 | 0.6% |
| 98 | 2 | 0.2% |
| Other values (2) | 2 | 0.2% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 69 | 31 | 2.4% |
| 71 | 1 | 0.1% |
| 72 | 66 | 5.1% |
| 73 | 1 | 0.1% |
| 86 | 230 | 17.8% |
| 90 | 32 | 2.5% |
| 97 | 149 | 11.5% |
| 98 | 2 | 0.2% |
| 107 | 16 | 1.2% |
| 110 | 745 | 57.7% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 192 | 11 | 0.9% |
| 116 | 8 | 0.6% |
| 110 | 745 | 57.7% |
| 107 | 16 | 1.2% |
| 98 | 2 | 0.2% |
| 97 | 149 | 11.5% |
| 90 | 32 | 2.5% |
| 86 | 230 | 17.8% |
| 73 | 1 | 0.1% |
| 72 | 66 | 5.1% |
cc
| Distinct | 13 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1576.677245 |
| Minimum | 1300 |
| Maximum | 16000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 1300 |
|---|---|
| 5-th percentile | 1300 |
| Q1 | 1400 |
| median | 1600 |
| Q3 | 1600 |
| 95-th percentile | 2000 |
| Maximum | 16000 |
| Range | 14700 |
| Interquartile range (IQR) | 200 |
Descriptive statistics
| Standard deviation | 443.6188712 |
|---|---|
| Coefficient of variation (CV) | 0.2813631469 |
| Kurtosis | 866.6884018 |
| Mean | 1576.677245 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 26.69622404 |
| Sum | 2037067 |
| Variance | 196797.7029 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=13)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1600 | 751 | 58.1% |
| 1300 | 229 | 17.7% |
| 1400 | 149 | 11.5% |
| 2000 | 107 | 8.3% |
| 1900 | 28 | 2.2% |
| 1800 | 13 | 1.0% |
| 1587 | 4 | 0.3% |
| 1598 | 3 | 0.2% |
| 1995 | 2 | 0.2% |
| 1398 | 2 | 0.2% |
| Other values (3) | 4 | 0.3% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1300 | 229 | 17.7% |
| 1332 | 2 | 0.2% |
| 1398 | 2 | 0.2% |
| 1400 | 149 | 11.5% |
| 1587 | 4 | 0.3% |
| 1598 | 3 | 0.2% |
| 1600 | 751 | 58.1% |
| 1800 | 13 | 1.0% |
| 1900 | 28 | 2.2% |
| 1975 | 1 | 0.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 16000 | 1 | 0.1% |
| 2000 | 107 | 8.3% |
| 1995 | 2 | 0.2% |
| 1975 | 1 | 0.1% |
| 1900 | 28 | 2.2% |
| 1800 | 13 | 1.0% |
| 1600 | 751 | 58.1% |
| 1598 | 3 | 0.2% |
| 1587 | 4 | 0.3% |
| 1400 | 149 | 11.5% |
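The "cc is highly skewed" warning appears to be driven almost entirely by the single value 16000. The sketch below rebuilds the column from the value counts in the tables above (13 distinct values, 1292 rows) and compares the skewness with and without that row. The estimator is the adjusted Fisher-Pearson coefficient, which is, to my understanding, what pandas' `Series.skew` computes; `sample_skewness` is a hypothetical helper name, and whether 16000 is a data-entry error (for example a mistyped 1600) is a guess the report does not confirm.

```python
import math

# Value -> count for cc, reconstructed from the frequency tables in this report.
cc_counts = {
    1300: 229, 1332: 2, 1398: 2, 1400: 149, 1587: 4, 1598: 3, 1600: 751,
    1800: 13, 1900: 28, 1975: 1, 1995: 2, 2000: 107, 16000: 1,
}


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected third standardized moment)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # biased second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n   # biased third central moment
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


# Expand the counts back into one value per row.
cc = [value for value, count in cc_counts.items() for _ in range(count)]
cc_without_outlier = [x for x in cc if x != 16000]

print(len(cc), sum(cc))
print(round(sample_skewness(cc), 2))
print(round(sample_skewness(cc_without_outlier), 2))
```

The reconstructed column reproduces the reported row count and sum, and its skewness lands close to the reported γ1. Dropping the one extreme row brings the skewness down to well under 1, so it is worth inspecting that record before applying any transformation to the whole column.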
Doors
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.2 KiB |

| Value | Count |
|---|---|
| 5 | 608 |
| 3 | 560 |
| 4 | 123 |
| 2 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1292 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 5 |
|---|---|
| 2nd row | 5 |
| 3rd row | 5 |
| 4th row | 3 |
| 5th row | 3 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 1292 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 1292 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 1292 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 608 | 47.1% |
| 3 | 560 | 43.3% |
| 4 | 123 | 9.5% |
| 2 | 1 | 0.1% |
Gears
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 10.2 KiB |

| Value | Count |
|---|---|
| 5 | 1247 |
| 6 | 42 |
| 3 | 2 |
| 4 | 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 1292 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | 5 |
|---|---|
| 2nd row | 5 |
| 3rd row | 5 |
| 4th row | 5 |
| 5th row | 5 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 1292 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 1292 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 1292 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 1247 | 96.5% |
| 6 | 42 | 3.3% |
| 3 | 2 | 0.2% |
| 4 | 1 | 0.1% |
QuarterlyTax
| Distinct | 12 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 87.25696594 |
| Minimum | 19 |
| Maximum | 283 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 19 |
|---|---|
| 5-th percentile | 64 |
| Q1 | 69 |
| median | 85 |
| Q3 | 85 |
| 95-th percentile | 185 |
| Maximum | 283 |
| Range | 264 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 40.97892647 |
|---|---|
| Coefficient of variation (CV) | 0.4696350145 |
| Kurtosis | 4.20118678 |
| Mean | 87.25696594 |
| Median Absolute Deviation (MAD) | 16 |
| Skewness | 1.985754622 |
| Sum | 112736 |
| Variance | 1679.272415 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 85 | 555 | 43.0% |
| 69 | 502 | 38.9% |
| 185 | 88 | 6.8% |
| 19 | 63 | 4.9% |
| 234 | 18 | 1.4% |
| 100 | 18 | 1.4% |
| 210 | 16 | 1.2% |
| 64 | 15 | 1.2% |
| 197 | 12 | 0.9% |
| 283 | 2 | 0.2% |
| Other values (2) | 3 | 0.2% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 19 | 63 | 4.9% |
| 40 | 1 | 0.1% |
| 64 | 15 | 1.2% |
| 69 | 502 | 38.9% |
| 72 | 2 | 0.2% |
| 85 | 555 | 43.0% |
| 100 | 18 | 1.4% |
| 185 | 88 | 6.8% |
| 197 | 12 | 0.9% |
| 210 | 16 | 1.2% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 283 | 2 | 0.2% |
| 234 | 18 | 1.4% |
| 210 | 16 | 1.2% |
| 197 | 12 | 0.9% |
| 185 | 88 | 6.8% |
| 100 | 18 | 1.4% |
| 85 | 555 | 43.0% |
| 72 | 2 | 0.2% |
| 69 | 502 | 38.9% |
| 64 | 15 | 1.2% |
Weight
| Distinct | 56 |
|---|---|
| Distinct (%) | 4.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1072.002322 |
| Minimum | 1000 |
| Maximum | 1480 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.2 KiB |
Quantile statistics
| Minimum | 1000 |
|---|---|
| 5-th percentile | 1015 |
| Q1 | 1040 |
| median | 1070 |
| Q3 | 1085 |
| 95-th percentile | 1140 |
| Maximum | 1480 |
| Range | 480 |
| Interquartile range (IQR) | 45 |
Descriptive statistics
| Standard deviation | 50.29384602 |
|---|---|
| Coefficient of variation (CV) | 0.04691579952 |
| Kurtosis | 12.6753301 |
| Mean | 1072.002322 |
| Median Absolute Deviation (MAD) | 30 |
| Skewness | 2.511349803 |
| Sum | 1385027 |
| Variance | 2529.470947 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1075 | 178 | 13.8% |
| 1050 | 144 | 11.1% |
| 1015 | 109 | 8.4% |
| 1035 | 95 | 7.4% |
| 1070 | 78 | 6.0% |
| 1025 | 63 | 4.9% |
| 1065 | 48 | 3.7% |
| 1080 | 39 | 3.0% |
| 1060 | 38 | 2.9% |
| 1100 | 38 | 2.9% |
| Other values (46) | 462 | 35.8% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1000 | 16 | 1.2% |
| 1010 | 4 | 0.3% |
| 1015 | 109 | 8.4% |
| 1020 | 9 | 0.7% |
| 1025 | 63 | 4.9% |
| 1030 | 21 | 1.6% |
| 1035 | 95 | 7.4% |
| 1040 | 31 | 2.4% |
| 1045 | 27 | 2.1% |
| 1050 | 144 | 11.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 1480 | 3 | 0.2% |
| 1320 | 3 | 0.2% |
| 1280 | 1 | 0.1% |
| 1275 | 2 | 0.2% |
| 1270 | 3 | 0.2% |
| 1265 | 1 | 0.1% |
| 1260 | 4 | 0.3% |
| 1255 | 6 | 0.5% |
| 1245 | 3 | 0.2% |
| 1205 | 4 | 0.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
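The definition above (covariance divided by the product of the standard deviations) can be sketched with the standard library alone. The data are the Age0804 and Price values from the first five sample rows of this report, so the result illustrates the calculation only and is not the coefficient for the full dataset; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)


# First five Age0804 / Price pairs from the sample rows (illustration only).
age = [20, 75, 59, 65, 55]
price = [16450, 7950, 10950, 8950, 9950]

r = pearson_r(age, price)

# Invariance check: shifting and rescaling either variable leaves r unchanged.
age_in_years = [a / 12 for a in age]
price_shifted = [p * 1.1 + 500 for p in price]
r_transformed = pearson_r(age_in_years, price_shifted)

print(r < 0, abs(r - r_transformed) < 1e-9)
```

On these five rows r is negative (older cars are cheaper), and the transformed variables give the same value up to floating-point error.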
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
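As a sketch of that recipe, the code below ranks both variables (ties receive the average of their ranks, a common convention that is assumed here) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and the data are synthetic.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Synthetic monotonic but nonlinear relationship: y = x ** 3.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic cubic relationship ρ is 1 (up to rounding) while r stays below 1, which is the difference the paragraph above describes.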
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
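The pair-counting formula in the paragraph above corresponds to the simple tau-a form. Library implementations such as SciPy's `kendalltau` default to a tie-adjusted variant (tau-b), so values in the report may differ when ties are present. A minimal sketch, with `kendall_tau_a` as a hypothetical helper name and the first five Age0804 / Price sample rows as data:

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# First five Age0804 / Price pairs from the sample rows (illustration only).
age = [20, 75, 59, 65, 55]
price = [16450, 7950, 10950, 8950, 9950]

tau = kendall_tau_a(age, price)
print(tau)
```

Of the ten pairs formed by these five rows, one is concordant and nine are discordant, so the sketch yields τ = (1 − 9) / 10 = −0.8.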
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013 that can be found here.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
First rows
| | Unnamed: 0 | Price | Age0804 | KM | HP | cc | Doors | Gears | QuarterlyTax | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 16450 | 20 | 22588 | 97 | 1400 | 5 | 5 | 85 | 1110 |
| 1 | 1 | 7950 | 75 | 57144 | 110 | 1600 | 5 | 5 | 85 | 1070 |
| 2 | 2 | 10950 | 59 | 79660 | 86 | 1300 | 5 | 5 | 85 | 1065 |
| 3 | 3 | 8950 | 65 | 60000 | 86 | 1300 | 3 | 5 | 69 | 1015 |
| 4 | 4 | 9950 | 55 | 44537 | 97 | 1400 | 3 | 5 | 69 | 1025 |
| 5 | 5 | 16750 | 24 | 25563 | 110 | 1600 | 3 | 5 | 19 | 1065 |
| 6 | 6 | 8900 | 59 | 36954 | 110 | 1600 | 3 | 5 | 69 | 1050 |
| 7 | 7 | 7450 | 76 | 154900 | 72 | 2000 | 5 | 5 | 185 | 1140 |
| 8 | 8 | 7250 | 74 | 130025 | 110 | 1600 | 3 | 5 | 69 | 1050 |
| 9 | 9 | 7950 | 68 | 57565 | 86 | 1300 | 5 | 5 | 69 | 1035 |
Last rows
| | Unnamed: 0 | Price | Age0804 | KM | HP | cc | Doors | Gears | QuarterlyTax | Weight |
|---|---|---|---|---|---|---|---|---|---|---|
| 1282 | 1282 | 16250 | 19 | 29441 | 97 | 1400 | 5 | 5 | 85 | 1110 |
| 1283 | 1283 | 9450 | 52 | 104805 | 97 | 1400 | 3 | 5 | 69 | 1025 |
| 1284 | 1284 | 10950 | 57 | 40214 | 86 | 1300 | 3 | 5 | 69 | 1025 |
| 1285 | 1285 | 12750 | 33 | 27240 | 110 | 1600 | 5 | 5 | 85 | 1075 |
| 1286 | 1286 | 8000 | 58 | 70560 | 110 | 1600 | 3 | 5 | 69 | 1050 |
| 1287 | 1287 | 8950 | 61 | 81170 | 110 | 1600 | 4 | 5 | 69 | 1040 |
| 1288 | 1288 | 6750 | 69 | 60050 | 110 | 1600 | 3 | 5 | 69 | 1050 |
| 1289 | 1289 | 7400 | 75 | 74096 | 110 | 1600 | 3 | 5 | 69 | 1050 |
| 1290 | 1290 | 11450 | 50 | 50400 | 110 | 1600 | 5 | 5 | 85 | 1080 |
| 1291 | 1291 | 8500 | 74 | 66718 | 110 | 1600 | 3 | 5 | 69 | 1050 |